For my corpus I have decided to focus on famous cover songs. To me this is a very interesting topic because it shows how different people can express the “same” song in ways that make it fundamentally different. It shows that the music, rhythm, pitch, timbre, loudness and anything else an artist brings to a song have a significant impact on the listener’s experience, even though the “story” told by the lyrics is exactly the same.
As you will notice throughout my corpus, the originals and their covers are not easy to represent as groups. The reason is that, apart from being versions of the same song, the songs are unrelated, except of course for the fact that a perhaps overly large portion of my songs was originally written by Bob Dylan. I have decided not to model any relationships between the Bob Dylan songs and their different covers, because I did not consider that to be the goal of this portfolio. However, these considerations and the weak connections between the songs do affect the external validity of this corpus.
A smaller issue, besides the overrepresentation of Bob Dylan, is the selection bias in my corpus. Over the years there have been countless songs with numerous covers, which are impossible to include in full; the ones I have chosen are simply the ones that I knew of and enjoy. The corpus is therefore not necessarily a strong representation of originals and cover songs in general.
Introduction
Tempo and Novelty
Waveform Analysis
Self Similarity
Pitch and Timbre Comparison
Some Chords
Machine Learning
Listen in!/Conclusion
To create an overview of the main differences between the songs, I have connected the versions of each song with an arrow, which can be seen in the next tab. When I combined colouring, size and ggplotly, the plot would not draw all the lines, so I have included two plots: one that shows which songs are included and how they are connected, and one with more information about the individual songs. This plot shows the first (general) overview of the songs in the corpus. The songs are quite spread out, but we see a denser cluster in the lower-left quadrant (0 < Valence < 0.5, 0 < Energy < 0.5), meaning we have more slow or “sad” songs. We also see a wide distribution of keys, which is interesting because in general these differences occur between different songs, meaning that the different versions of the same song are usually in the same key.
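The quadrant observation above can be checked with a simple count. The following is a minimal sketch in Python rather than the R used for the portfolio; the track names and values are hypothetical placeholders, not the actual corpus data, and the function names are my own.

```python
# HYPOTHETICAL (name, valence, energy) rows; NOT the real corpus values.
tracks = [
    ("track_a", 0.20, 0.30),
    ("track_b", 0.35, 0.45),
    ("track_c", 0.70, 0.80),
    ("track_d", 0.40, 0.90),
]


def in_low_quadrant(valence, energy, threshold=0.5):
    """True when both features lie strictly between 0 and the threshold."""
    return 0 < valence < threshold and 0 < energy < threshold


def low_quadrant_share(rows):
    """Fraction of tracks in the low-valence, low-energy quadrant."""
    if not rows:
        return 0.0
    hits = sum(1 for _, valence, energy in rows if in_low_quadrant(valence, energy))
    return hits / len(rows)


share = low_quadrant_share(tracks)
print(f"{share:.0%} of the example tracks sit in the low quadrant")
```

A share well above 25% would support the impression of a denser cluster, since a quarter is what an even spread over the four quadrants would give.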
Here we lose a little information, but we gain insight into how the versions of each song (original and cover(s)) relate to each other. We can see, for example, that the Dancing in the Moonlight versions differ quite a bit in both energy and valence, whereas the Resolution versions differ a lot in energy but have nearly the same valence. Interesting, I think, is the Knockin’ on Heaven’s Door connection, where valence and energy rise from one version to the next, so that both are clearly higher for the rightmost version than for the leftmost. Since we lost insight into key and tempo, I will just tell you that the rightmost version is the Guns ’n Roses version, which is also in a different key than the other two. Given this difference, we will focus on these songs in the rest of the portfolio.
Something else we will revisit is the pair of Sound of Silence versions. These two are surprisingly close together, even though the Simon and Garfunkel version sounds far more upbeat and happy to me. In general they have a totally different feel.
There is a really interesting difference in distribution here. The tempos of both groups are centered around 120 bpm, but the originals have a much earlier cut-off, around 85 bpm, whereas the covers go as low as 65 bpm. The track this corresponds to is Make You Feel My Love by Adele (cover), which is interesting because it is not the track with the lowest valence or energy. This tells us that the Spotify API does not necessarily treat tempo as the main factor for energy and valence. Therefore we will look at tempo for our two chosen songs.
Even though the Bob Dylan and Eric Clapton versions have a much slower feel than the Guns ’n Roses version, the tempograms do not seem to differ much in BPM. For this reason I tried to verify the tempos through Google. The tempo estimates for the Bob Dylan and Guns ’n Roses versions vary quite a lot, somewhere between 65 and 140 bpm, but for the Eric Clapton version they are fairly consistent. I found this interesting, considering that its tempogram shows two fairly strong tempo “powers”. I think this might be caused by some rather quiet percussion in the background, but I am not sure. Also worth noting is the tempo increase at the beginning and the decrease at the end of the Guns ’n Roses version. This change is clearly audible, which to me suggests that the estimates are fairly accurate. Nonetheless, the tempo estimates shown here are an interesting indication that the tempo remained more or less the same across the different cover versions.
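A well-known reason for tempo estimates that disagree this much is that an estimator can lock onto half or double the perceived tempo. Whether that is what happens for these tracks is an assumption on my part, not something the tempograms prove. The sketch below (Python, with a hypothetical helper name and an arbitrary 10% tolerance) only checks whether two estimates are roughly a factor of two apart.

```python
def is_octave_related(bpm_a, bpm_b, tolerance=0.10):
    """True when the faster tempo is about twice the slower one.

    `tolerance` is the allowed relative deviation from an exact factor
    of two; 0.10 is an arbitrary choice for illustration.
    """
    low, high = sorted((bpm_a, bpm_b))
    if low <= 0:
        raise ValueError("tempos must be positive")
    ratio = high / low
    return abs(ratio - 2.0) / 2.0 <= tolerance


# The two extremes mentioned in the text for the Dylan and
# Guns 'n Roses estimates.
print(is_octave_related(65, 140))
```

With these two values the ratio is about 2.15, which falls inside the chosen tolerance, so the spread in the reported estimates is at least compatible with a half-time versus double-time reading of the same pulse.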
The tempogram for the Disturbed version looks really interesting: it is a little all over the place, but very clear around 80. It also seems to go up consistently, and I am not sure why. The small wobble just after 200 seconds may be caused by the introduction of extra instruments, especially percussion; after that there is a clear fade-out. The tempogram for the Simon and Garfunkel version is really consistent and seems very accurate. There are no real surprises here, and again there is a clear fade-out at the end. What does surprise me is the apparent accuracy of the estimate: averaged over the track, the Spotify API is more or less spot on.
Even though the grids vary (pay attention to the different x and y scales), the degree of novelty does not seem to differ significantly between the versions. We see one outlier in the Guns ’n Roses graph, around 240, that dwarfs the other values. This is a small spike after a silent part, where they start singing “knock knock knocking” again. Other than that, the novelty values generally stay below a maximum of 15, especially in the Eric Clapton version, which is expected since this is a rather consistent song.
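To illustrate why a loud entry after a silent part produces such a spike, here is a minimal sketch of an energy-based novelty function in Python: the half-wave rectified difference between consecutive frame energies. This is one common definition and an assumption on my part; it is not necessarily the exact function behind the plots, and the energy values are made up.

```python
def energy_novelty(energies):
    """Half-wave rectified first difference of a list of frame energies.

    Only increases in energy count as novelty; decreases are set to 0.
    """
    return [max(0.0, cur - prev) for prev, cur in zip(energies, energies[1:])]


# HYPOTHETICAL frame energies: steady, then a quiet passage, then a loud entry.
frames = [1.0, 1.0, 0.1, 0.1, 5.0, 5.0]
novelty = energy_novelty(frames)

# Index of the largest novelty value, i.e. the transition into the loud entry.
peak = novelty.index(max(novelty))
print(peak, novelty)
```

Because the function measures change rather than level, the drop into silence contributes nothing, while the jump back out of it dominates everything else, much like the outlier in the Guns ’n Roses graph.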
As you can see, I have included waveform comparisons between the different versions of Knockin’ on Heaven’s Door. I chose this song because the waveforms show no similarity whatsoever. For the Bob Dylan version this makes sense, as it is significantly shorter than the other versions. However, the Eric Clapton and Guns ’n Roses versions are very similar in duration. I think this is a very interesting illustration of how different the “same” song can be. Having listened to these songs, though, this is hardly surprising, especially if we also take into account that they are in different keys. I will try to show this with other graphs later on, as making waveform analyses for all songs is neither feasible nor informative. I tried showing the waveforms in a row-based layout, which looks nicer, but the code that is shown alongside them really messes up the overview.
As is clearly visible from the diagonal lines in both chroma self-similarity graphs, there is a significant amount of repetition in both versions. This is not very surprising: repetition is quite characteristic of Simon and Garfunkel songs in general, and Disturbed did not make significant changes to the pitches or lyrics. What I do find interesting are the large blocks visible in the timbre graph for the Disturbed version. Within these blocks, (quite strong) new percussion is introduced every now and then, yet the timbre seems to be dominated by the voice, which is of course present during the entire song. Personally I think this is also the reason that we experience these versions so differently.
I have decided not to include the self-similarity graphs for the Knockin’ on Heaven’s Door versions. They offer little to compare, and there is not much to say about them in relation to each other, other than that they are also relatively repetitive.
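Why repetition shows up as diagonal lines can be seen in a toy example. The sketch below builds a self-similarity matrix from cosine similarity between feature frames; the three-dimensional frames are hypothetical stand-ins for chroma or timbre vectors, and cosine similarity is my assumption, since other distance measures are possible.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def self_similarity(frames):
    """Square matrix whose entry [i][j] compares frame i with frame j."""
    return [[cosine_similarity(a, b) for b in frames] for a in frames]


# HYPOTHETICAL frames: section A, section B, then both repeated.
section_a = [1.0, 0.0, 0.0]
section_b = [0.0, 1.0, 0.0]
matrix = self_similarity([section_a, section_b, section_a, section_b])

for row in matrix:
    print(row)
```

The repeated A-B pattern gives high values at positions [0][2] and [1][3], which lie on a line parallel to the main diagonal. Long stretches of similar frames would instead fill in a square block, as in the timbre graph for the Disturbed version.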
I have decided to include only the pitch comparison for The Sound of Silence, as the one for Knockin’ on Heaven’s Door was more or less as expected and we will revisit it when we look at chords and keys. I expected the pitches of the two Sound of Silence versions to be quite similar, partly because of the similar keys, but also because at the beginning a guitar (Simon and Garfunkel) and a piano (Disturbed) quite distinctly play similar pitches. The Sound of Silence is in the key of D minor (relative major: F major), which we can see from the relatively strong magnitudes in D, E and A. Interestingly, though, the analysis seems to have missed the F and instead measures a lot of activity in F#, which is not in the key, even though the F chord is part of the song.
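The claim that F belongs to the key and F# does not can be verified by spelling out the scale. This is a small Python sketch with my own helper names; it uses sharps only, so B flat appears as A#.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets from the tonic.
NATURAL_MINOR_STEPS = [0, 2, 3, 5, 7, 8, 10]
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]


def scale(tonic, steps):
    """Note names of the scale built on `tonic` with the given semitone offsets."""
    root = NOTE_NAMES.index(tonic)
    return [NOTE_NAMES[(root + step) % 12] for step in steps]


d_minor = scale("D", NATURAL_MINOR_STEPS)
f_major = scale("F", MAJOR_STEPS)

print(d_minor)
print("F in key:", "F" in d_minor, "| F# in key:", "F#" in d_minor)
print("same notes as F major:", set(d_minor) == set(f_major))
```

D natural minor and F major contain the same seven notes, which is what makes them relative keys, and F# is not among them.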
We see that the distribution of the timbre features really is all over the place. The values for C01 and C02 are centered, and the features towards the end seem to be more centered as well. As I have no clear interpretation of what these timbre features represent, I will use this plot as a reference to see how the songs compare to the rest of the corpus.
We see some expected behavior in the timbre analysis. C01 and C02 are quite strongly and consistently represented, which is normal for the corpus. Interesting is the clear build-up in the representation of the timbre features: the songs seem to have a relatively high presence of timbre features up to C05, which is not necessarily what the previous graph would lead us to expect. However, it is important to keep in mind that these timbre representations are relative values.
We can see a larger magnitude in more timbre features for the Guns ’n Roses version than for the other two. This is to be expected when listening to these songs: the Guns ’n Roses version has the most variation and actually sounds “biggest”. The Eric Clapton version is much calmer and steadier, while the Bob Dylan version is actually pretty quiet and steady, which is also to be expected given its much shorter length. Interesting as well is that these songs show significantly more intensity in many timbre features than the Sound of Silence versions. This might also be a relative effect, however, considering the consistency we noticed in the chroma and timbre self-similarity graphs for The Sound of Silence in the previous part.
Having looked at the pitches for The Sound of Silence, I think it is now interesting to look at the keys and chords for the Knockin’ on Heaven’s Door versions, since they differ from each other. First we will look at the chords used in each version and see whether they match the expected keys.
Google tells me that for both the Bob Dylan and the Eric Clapton version the chords should be G, D, Am and C. The Guns ’n Roses version uses G5, D, C and Am7: a very small difference, yet the key is interpreted quite differently.
Even at first glance it is interesting how many similarities the Bob Dylan and Eric Clapton versions show. Looking more closely, we can see a very strong presence of the Am chord throughout the entire song in both versions. The G and C chords also have a strong presence, though slightly less so. The D major chord seems to be less present, yet still clearly visible. I am not entirely sure why this is the case. There might be a music-theory explanation regarding the frequencies of these chords, but my knowledge does not extend that far. The Guns ’n Roses version is clearly distinct from the other two. Interesting here are the D major and D7 chords, which are strongly present. Given the key, this is to be expected; however, where we see D7 we would have expected Am7.
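Chord plots like these are typically produced by comparing each chroma vector with a template per chord. The sketch below shows the idea in Python with binary templates and cosine similarity; both choices are assumptions for illustration and not necessarily how the graphs in this portfolio were computed.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Chord tones for the chords mentioned in the text.
CHORDS = {
    "G": ["G", "B", "D"],
    "D": ["D", "F#", "A"],
    "Am": ["A", "C", "E"],
    "C": ["C", "E", "G"],
    "D7": ["D", "F#", "A", "C"],
}


def template(notes):
    """12-dimensional binary vector with a 1 for every chord tone."""
    vector = [0.0] * 12
    for note in notes:
        vector[NOTE_NAMES.index(note)] = 1.0
    return vector


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def chord_scores(chroma):
    """Similarity of one chroma vector to every chord template."""
    return {name: cosine(chroma, template(notes)) for name, notes in CHORDS.items()}


def best_chord(chroma):
    """Name of the chord whose template matches the chroma vector best."""
    scores = chord_scores(chroma)
    return max(scores, key=scores.get)


# An idealised frame containing exactly the notes A, C and E.
frame = template(["A", "C", "E"])
print(best_chord(frame), chord_scores(frame))
```

Note that the idealised A minor frame still scores about 0.67 against the C template, because the two chords share C and E. Shared notes like these are one possible reason why several chords can light up at once in such a plot.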
Maybe even more interesting than the chords used in the different versions are the keys. As we know, the Guns ’n Roses version is one of the few songs in the corpus that differs in key from the original, and this difference is clearly visible. Interesting, I think, is that the Bob Dylan and Eric Clapton versions show very similar keys, but instead of a general G major (relative E minor) they show a very strong response at Gb major (relative Eb minor), which is in fact the F# major that the Guns ’n Roses version is in. Nonetheless, the strength remains quite high at the actual G major key. Similarly, Gb major/D# minor are present for the Guns ’n Roses version, but we see at least equally strong magnitudes around C minor and C# minor. Considering the distance between these keys on the circle of fifths, I am not entirely certain why the key analysis shows this picture.
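The distance between keys on the circle of fifths can be made concrete with a little arithmetic. The sketch below is my own helper code in Python: it maps a minor key onto its relative major (three semitones above the minor tonic) and counts the steps between two keys around the circle.

```python
PITCH_CLASS = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10, "B": 11,
}


def fifths_position(tonic, mode="major"):
    """Position (0-11) of a key on the circle of fifths, with C major at 0."""
    pitch = PITCH_CLASS[tonic]
    if mode == "minor":
        pitch = (pitch + 3) % 12  # use the relative major of a minor key
    # Multiplying by 7 (a fifth in semitones) turns pitch classes into fifth steps.
    return (pitch * 7) % 12


def fifths_distance(key_a, key_b):
    """Smallest number of steps between two (tonic, mode) keys on the circle."""
    difference = (fifths_position(*key_a) - fifths_position(*key_b)) % 12
    return min(difference, 12 - difference)


print(fifths_distance(("G", "major"), ("E", "minor")))   # relative keys
print(fifths_distance(("G", "major"), ("Gb", "major")))  # a semitone apart
print(fifths_distance(("Gb", "major"), ("C", "minor")))
```

G major and Gb major are only a semitone apart in pitch but five steps apart on the circle, close to the maximum of six, so by this measure they share very few notes. That makes the strong Gb major response for the G major versions all the more puzzling.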
My classifier, which tries to predict whether a track is an original or not, does not do well at all: its accuracy is more or less random. This is of course not strange, since there are no actual features that can predict whether a song is an original. We could include the release date (which would of course be a very good indicator), but this would nullify the intention of the exercise, which is to see whether there is a notable difference between originals and cover songs. An interesting feature to add to the Spotify API might be sound recording quality (if such a measurement could exist). This would probably be a good indication of when a track was recorded and therefore of whether it is a cover.
As can be seen here, I have chosen to extract features that I consider to be meaningful to people. I have left out C#|D (which was impossible to use, since R would treat it as a comment) and went for more indicative features, which I have plotted in the second graph. The features are all over the place, but we do see some small clusters of only covers and of only originals. I assume that these smaller clusters are the reason that some of the classifications were correct. Let’s see if we have more luck with the clustering!
# A tibble: 2 x 3
  class precision recall
  <fct>     <dbl>  <dbl>
1 No        0.5    0.5
2 Yes       0.435  0.435
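For readers unfamiliar with these two measures, the sketch below shows how precision and recall follow from the counts in a confusion matrix. The actual confusion matrix is not shown in this portfolio, so the counts are HYPOTHETICAL: they were chosen only because they reproduce the values in the table. I also assume that “Yes” means “original”.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) for one class from its confusion counts."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# HYPOTHETICAL counts, not taken from the real classifier output.
originals_hit, originals_missed = 10, 13  # "Yes" tracks predicted Yes / predicted No
covers_hit, covers_missed = 13, 13        # "No" tracks predicted No / predicted Yes

# For the "Yes" class, covers wrongly predicted as originals are the false positives.
yes_scores = precision_recall(originals_hit, covers_missed, originals_missed)
# For the "No" class, originals wrongly predicted as covers are the false positives.
no_scores = precision_recall(covers_hit, originals_missed, covers_missed)

print("Yes:", [round(score, 3) for score in yes_scores])
print("No: ", [round(score, 3) for score in no_scores])
```

Precision equals recall for a class whenever it has as many false positives as false negatives, which is one way the identical columns in the table could arise.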
We can observe that some versions of the same song end up in the same cluster. I am not sure how to judge this, but in general I think it is a reasonable outcome of the clustering, and probably the best we could have hoped for: most songs are at least in closely related clusters. Interesting, for example, is the All Along the Watchtower version that is isolated at the bottom, very far away from the original. Also interesting is looking back at the Knockin’ on Heaven’s Door versions, where two have been clustered next to each other (Bob Dylan and Eric Clapton) while the third (Guns ’n Roses) has been set apart. This is not that strange given the different key and the different chords.
For a better idea of the songs I have been analysing, you can listen to them briefly here. I particularly recommend listening to the differences between the Sound of Silence versions. In my opinion these are significantly more different than they appear from the data. They are also simply incredible songs, so always worth a listen.
As we have seen throughout my portfolio, even when keys and even pitches are the same, the songs and the way we experience them can be immensely different, just because a different artist gave a different “swing” to them. I have found it difficult to express this in data, and have therefore focused more on showing the similarities in contrast with the differences. I hope that, having followed this portfolio, this idea of the differences between similar songs has reached you as well. I myself have been quite surprised by how marginal the differences in the data are, given their immense impact on feeling, especially for the Sound of Silence versions. When I first heard the Disturbed version I was amazed by what they had done with it, yet when really listening to it the changes are marginal, as can also be seen in the data. The timbre data in particular might be a really good indication of this.
I regret not having had the opportunity to look at all the songs. Yet I believe that by analyzing two versions that are placed close together (The Sound of Silence) and the Knockin’ on Heaven’s Door versions, which are relatively far apart and contrast strongly with each other, I have been able to show how differently the songs come across. Although I understand that many of the decisions here were my own interpretation, I hope to look at this again one day with a slightly more “music theory” eye, especially to understand the differences in key and chord usage, since the chord usage in the Knockin’ on Heaven’s Door versions is so similar while the keys and the sound are completely different. Where I could definitely hear the similarities between the Sound of Silence versions, I could absolutely not do so with Knockin’ on Heaven’s Door. In hindsight it would probably have been better to include many versions of the same few songs instead of a few covers of many different songs, to gain even better insight into these differences.
To conclude my portfolio I would like to add a note on the classification and clustering. As was visible and mentioned before, the classification was absolutely horrible, which was to be expected since there is no way of knowing whether something is an original or a cover (other than by looking at the date). What is interesting is that the clustering did better. This means that, because of the statistical similarities between these songs, the clustering algorithms are more or less capable of at least placing the versions somewhat together. I do not believe, especially in certain cases, that a human would be capable of doing so if we removed the lyrics and their meaning.